# Two-stage Fine-tuning

## MN Slush

Slush is a two-stage fine-tuned model trained with a high LoRA dropout rate, focused on enhancing creativity and role-playing ability.

- Tags: Large Language Model, Transformers
- Author: crestf411
- Downloads: 59 · Likes: 27
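The listing does not give Slush's exact recipe, but the "high LoRA dropout" idea is easy to illustrate. A minimal PEFT sketch, assuming a Mistral-Nemo base (the "MN" prefix usually denotes one, though the listing doesn't say) and an illustrative rank and dropout value, not crestf411's actual settings:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Base checkpoint is an assumption; the listing does not name Slush's base.
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-Nemo-Base-2407")

config = LoraConfig(
    r=64,                # illustrative rank
    lora_alpha=64,
    lora_dropout=0.5,    # deliberately high dropout, per the description
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapter weights train
```

High dropout on the adapter acts as a regularizer, pushing the model away from memorizing its role-play data verbatim.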
## ALMA 13B Pretrain

ALMA is a translation system built on LLaMA-2-13B and trained in two stages; its paradigm of monolingual-data fine-tuning followed by parallel-corpus optimization significantly improves translation performance.

- License: MIT
- Tags: Machine Translation, Transformers
- Author: haoranxu
- Downloads: 3,491 · Likes: 10
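A hedged sketch of the two-stage recipe the description refers to: stage 1 continues causal-LM training on monolingual target-language text, stage 2 fine-tunes on a small set of high-quality parallel pairs rendered as translation prompts. The toy data, hyperparameters, and prompt wording below are placeholders, not the ALMA paper's actual setup:

```python
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

base_id = "meta-llama/Llama-2-13b-hf"  # ALMA's stated foundation
tok = AutoTokenizer.from_pretrained(base_id)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base_id)

# Toy stand-ins; ALMA trains on large monolingual corpora (stage 1)
# and a small set of high-quality parallel pairs (stage 2).
monolingual = ["Ein Beispielsatz auf Deutsch.", "Noch ein Satz."]
parallel = [("Hello.", "Hallo."), ("Good morning.", "Guten Morgen.")]

class LMData(Dataset):
    def __init__(self, texts):
        self.items = [tok(t, truncation=True, max_length=256) for t in texts]
    def __len__(self):
        return len(self.items)
    def __getitem__(self, i):
        ids = torch.tensor(self.items[i]["input_ids"])
        # Loss on every token; a real stage-2 setup would typically mask
        # the prompt so loss falls only on the target translation.
        return {"input_ids": ids, "labels": ids.clone()}

def run_stage(texts, out_dir):
    args = TrainingArguments(output_dir=out_dir, num_train_epochs=1,
                             per_device_train_batch_size=1, report_to="none")
    Trainer(model=model, args=args, train_dataset=LMData(texts)).train()

# Stage 1: continued causal-LM training on monolingual target-language text.
run_stage(monolingual, "stage1")
# Stage 2: supervised fine-tuning on prompted parallel pairs.
prompts = [f"Translate this from English to German:\nEnglish: {s}\nGerman: {t}"
           for s, t in parallel]
run_stage(prompts, "stage2")
```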
## ALMA 13B

ALMA is an LLM-based translator trained with the two-stage paradigm (monolingual fine-tuning + parallel-corpus optimization); the 13B-LoRA release reaches its best performance via LoRA fine-tuning on top of the LLaMA-2-13B foundation.

- License: MIT
- Tags: Machine Translation, Transformers
- Author: haoranxu
- Downloads: 855 · Likes: 36
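Using the LoRA variant follows the usual PEFT pattern: attach the adapter to the stage-1 checkpoint, then prompt in ALMA's "Translate this from X to Y" style. The hub ids and generation settings below are assumptions to verify against haoranxu's model cards:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub ids inferred from the listing; confirm before use.
base_id = "haoranxu/ALMA-13B-Pretrain"
lora_id = "haoranxu/ALMA-13B-Pretrain-LoRA"

model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, lora_id)  # attach the adapter
tok = AutoTokenizer.from_pretrained(base_id, padding_side="left")

prompt = "Translate this from German to English:\nGerman: Guten Morgen.\nEnglish:"
inputs = tok(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, num_beams=5, max_new_tokens=40)
print(tok.batch_decode(out, skip_special_tokens=True)[0])
```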
## ALMA 7B

ALMA-7B applies the same two-stage paradigm (monolingual fine-tuning + parallel-corpus optimization) to the smaller LLaMA-2-7B base; the follow-up ALMA-R models add Contrastive Preference Optimization (CPO) fine-tuning and can match or even surpass GPT-4 and WMT competition winners.

- License: MIT
- Tags: Machine Translation, Transformers
- Author: haoranxu
- Downloads: 256 · Likes: 25
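CPO is a DPO-style preference objective without a frozen reference model, plus a negative-log-likelihood term on the preferred translation. A minimal PyTorch sketch of that loss, assuming per-sequence log-probabilities are already computed; the β value and length normalization are placeholders:

```python
import torch
import torch.nn.functional as F

def cpo_loss(logp_chosen: torch.Tensor, logp_rejected: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Contrastive Preference Optimization loss (sketch).

    logp_chosen / logp_rejected: (batch,) log-probabilities of the
    preferred and dispreferred translations under the current policy.
    """
    # Preference term: contrast chosen vs. rejected, no reference model.
    prefer = -F.logsigmoid(beta * (logp_chosen - logp_rejected))
    # Behavior-cloning term keeps the policy close to the preferred outputs.
    nll = -logp_chosen
    return (prefer + nll).mean()

# Toy usage with random log-probabilities.
loss = cpo_loss(torch.randn(4) - 5.0, torch.randn(4) - 6.0)
print(loss)
```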
## TrOCR Large Handwritten Fr

TrOCR large model fine-tuned for French handwritten text with a two-stage fine-tuning strategy; suited to single-line text-image recognition.

- License: MIT
- Tags: Text Recognition, Transformers, French
- Author: agomberto
- Downloads: 806 · Likes: 1
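Single-line recognition with a TrOCR checkpoint uses the standard transformers pairing of a processor and a VisionEncoderDecoderModel. A sketch assuming the hub id `agomberto/trocr-large-handwritten-fr` (inferred from the listing) and a local image of one handwritten line:

```python
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

model_id = "agomberto/trocr-large-handwritten-fr"  # assumed hub id
processor = TrOCRProcessor.from_pretrained(model_id)
model = VisionEncoderDecoderModel.from_pretrained(model_id)

# The model targets single-line images; crop lines before recognition.
image = Image.open("line.png").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```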
## DPR Question Encoder Single LFQA Wiki

A DPR-based question encoder designed for long-form question answering (LFQA), with retrieval performance optimized through two-stage training.

- License: MIT
- Tags: Question Answering System, Transformers, English
- Author: vblagoje
- Downloads: 588 · Likes: 3
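The encoder maps a question to a single dense vector for retrieval. A sketch assuming the hub id `vblagoje/dpr-question_encoder-single-lfqa-wiki` (inferred from the listing):

```python
import torch
from transformers import DPRQuestionEncoder, DPRQuestionEncoderTokenizer

model_id = "vblagoje/dpr-question_encoder-single-lfqa-wiki"  # assumed hub id
tokenizer = DPRQuestionEncoderTokenizer.from_pretrained(model_id)
model = DPRQuestionEncoder.from_pretrained(model_id)

inputs = tokenizer("Why is the sky blue?", return_tensors="pt")
with torch.no_grad():
    # One dense vector per question, compared against passage embeddings
    # (from the matching context encoder) by inner product at retrieval time.
    query_vec = model(**inputs).pooler_output  # shape: (1, hidden_size)
```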